Abstract Variance and Fisher information are ingredients of the Cramer-Rao
inequality. We regard Fisher information as a Riemannian metric on a quantum
statistical manifold and choose monotonicity under coarse graining as the fundamental
property of variance and Fisher information. In this approach we show that there is
a kind of dual one-to-one correspondence between the candidates of the two concepts.
We emphasize that Fisher informations are obtained from relative entropies as contrast
functions on the state space and argue that the scalar curvature might be interpreted
as an uncertainty density on a statistical manifold.

On the one hand standard quantum mechanics is a statistical theory, on the other
hand, there is a so-called geometrical approach to mathematical statistics [?]. In this
paper the two topics are combined and the concepts of covariance and Fisher information
are studied from an abstract point of view. We start with the Cramer-Rao inequality to
realize that the two concepts are very strongly related. What they have in common is
a kind of monotonicity property under coarse grainings. (Formally the monotonicity
of covariance is a bit different from that of Fisher information.) Monotone quantities
of Fisher information type determine a superoperator $\mathbb{J}$ which immediately gives a
kind of generalized covariance. In this way a one-to-one correspondence is established
between the candidates of the two concepts. In the paper we prove a Cramer-Rao type
inequality in the setting of generalized variance and Fisher information. Moreover,
we argue that the scalar curvature of the Fisher information Riemannian metric has
a statistical interpretation. This gives an interpretation of an earlier formulated but
still open conjecture on the monotonicity of the scalar curvature.



1 The Cramer-Rao inequality for an introduction 



The Cramer-Rao inequality belongs to the basics of estimation theory in mathematical
statistics. Its quantum analog was discovered immediately after the foundation of
mathematical quantum estimation theory in the 1960's, see the book [?] of Helstrom,
or the book [?] of Holevo for a rigorous summary of the subject. Although both the
classical Cramer-Rao inequality and its quantum analog are as trivial as the Schwarz
inequality, the subject attracts a lot of attention because it is located on the highly
exciting boundary of statistics, information and quantum theory.

As a starting point we give a very general form of the quantum Cramer-Rao inequality
in the simple setting of finite dimensional quantum mechanics. For
$\theta \in (-\varepsilon, \varepsilon) \subset \mathbb{R}$ a statistical operator $D_\theta$ is given and the aim
is to estimate the value of the parameter $\theta$ close to 0. Formally $D_\theta$ is an
$n \times n$ positive semidefinite matrix of trace 1 which describes a mixed state of a
quantum mechanical system and we assume that $D_\theta$ is smooth (in $\theta$). In our
approach we deal with mixed states, contrary to several other authors, see [?] for
example. Assume that an estimation is performed by the measurement of a selfadjoint
matrix $A$ playing the role of an observable. $A$ is called a locally unbiased
estimator if

$$\frac{\partial}{\partial\theta} \operatorname{Tr} D_\theta A \Big|_{\theta=0} = 1. \tag{1}$$

This condition holds if $A$ is an unbiased estimator for $\theta$, that is

$$\operatorname{Tr} D_\theta A = \theta \qquad (\theta \in (-\varepsilon, \varepsilon)). \tag{2}$$

To require this equality for all values of the parameter is a serious restriction on the
observable $A$ and we prefer to use the weaker condition (1).

Let $\varphi_0[\,\cdot\,,\,\cdot\,]$ be an inner product on the linear space of selfadjoint matrices.
$\varphi_0[\,\cdot\,,\,\cdot\,]$ depends on the density matrix $D_0$, the notation reflects this fact.
When $D_\theta$ is smooth in $\theta$, as was already assumed above, the correspondence

$$B \mapsto \frac{\partial}{\partial\theta} \operatorname{Tr} D_\theta B \Big|_{\theta=0} \tag{3}$$

is a linear functional on the selfadjoint matrices and it is of the form $\varphi_0[B, L]$ with
some $L = L^*$. From (1) and (3) we have $\varphi_0[A, L] = 1$ and the Schwarz inequality
yields

$$\varphi_0[A, A] \ge \frac{1}{\varphi_0[L, L]}. \tag{4}$$

This is the celebrated inequality of Cramer-Rao type for the locally unbiased estimator.
We want to interpret the left-hand side as a generalized variance of $A$. The
right-hand side of (4) is independent of the estimator and provides a lower bound for the
generalized variance. The denominator $\varphi_0[L, L]$ appears in the role of Fisher
information here. We call it the quantum Fisher information with respect to the
generalized variance $\varphi_0[\,\cdot\,,\,\cdot\,]$. This quantity depends on the tangent of the curve
$D_\theta$.

We want to conclude from the above argument that whatever Fisher information
and generalized variance are in the quantum mechanical setting, they are very strongly
related. In an earlier work ([20], [?]) we used a monotonicity condition to restrict
the class of Riemannian metrics on the state space of a quantum system.
The monotone metrics are called Fisher information quantities in this paper. Now we
observe that a similar monotonicity property can be used to get a class of bilinear
forms; we call the elements of this class generalized variances. The usual covariance of
two observables is included, as are many other quantities. We describe a one-to-one
correspondence between variances and Fisher informations. The correspondence is
given by a superoperator $\mathbb{J}$ which appears immediately in the analysis of the
inequality (4).

Since the necessary and sufficient condition for equality in the Schwarz inequality
is well known, we are able to analyze the case of equality in (4). The condition for
equality is

$$A = \lambda L$$

for some constant $\lambda \in \mathbb{R}$. On the $n \times n$ selfadjoint matrices we have two inner
products: $\varphi_0[\,\cdot\,,\,\cdot\,]$ and $\langle A, B \rangle := \operatorname{Tr} AB$. There exists a linear operator
$\mathbb{J}_0$ on the selfadjoint matrices such that

$$\varphi_0[A, B] = \operatorname{Tr} A\, \mathbb{J}_0(B).$$

Therefore the necessary and sufficient condition for equality in (4) is

$$\dot D_0 := \frac{\partial}{\partial\theta} D_\theta \Big|_{\theta=0} = \lambda^{-1} \mathbb{J}_0(A). \tag{5}$$

Therefore there exists a unique locally unbiased estimator attaining equality, namely
$A = \lambda \mathbb{J}_0^{-1}(\dot D_0)$, where the number $\lambda$ is chosen in such a way that the
condition (1) is satisfied.
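
The choice of $\lambda$ can be made explicit. The short computation below is only a sketch: it writes $\dot D_0$ for the derivative of $D_\theta$ at $\theta = 0$, assumes $\dot D_0 \neq 0$ (so that $\varphi_0[L,L] > 0$), and uses nothing beyond the relations $\operatorname{Tr} \dot D_0 B = \varphi_0[B, L]$ for every selfadjoint $B$ and $\varphi_0[A,B] = \operatorname{Tr} A\, \mathbb{J}_0(B)$.

```latex
\begin{align*}
\operatorname{Tr} \dot D_0 B = \varphi_0[B, L] = \operatorname{Tr} B\, \mathbb{J}_0(L)
   \ \text{ for all } B = B^*
   \quad &\Longrightarrow \quad \dot D_0 = \mathbb{J}_0(L), \\
A = \lambda L \ \text{ and } \ \varphi_0[A, L] = 1
   \quad &\Longrightarrow \quad \lambda = \frac{1}{\varphi_0[L, L]}, \\
\varphi_0[A, A] = \lambda^2\, \varphi_0[L, L]
   \quad &\Longrightarrow \quad \varphi_0[A, A] = \frac{1}{\varphi_0[L, L]} .
\end{align*}
```

So, under these assumptions, the optimal estimator is the logarithmic derivative $L$ rescaled by the inverse of the Fisher information.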



2 Coarse graining and Fisher information 

In the simple setting in which the state is described by a density matrix, a coarse
graining is an affine mapping sending density matrices into density matrices. Such a
mapping extends to all matrices and provides a positivity and trace preserving linear
transformation. A common example of coarse graining sends the density matrix $D_{12}$ of
a composite system $1 + 2$ into the (reduced) density matrix $D_1$ of component 1. (There
are several reasons to assume complete positivity of a coarse graining but we do not
consider this issue now.)

Assume that $D_\theta$ is a smooth curve of density matrices with tangent $A := \dot D_0$ at
$D_0$. The quantum Fisher information $F_D(A)$ is an information quantity associated
with the pair $(D_0, A)$; it appeared in the Cramer-Rao inequality above, where the Fisher
information gives a bound for the (generalized) variance of a locally unbiased estimator.
Let now $\alpha$ be a coarse graining. Then $\alpha(D_\theta)$ is another curve in the state space.
Due to the linearity of $\alpha$, the tangent at $\alpha(D_0)$ is $\alpha(A)$. As usual in statistics,
information cannot be gained by coarse graining, therefore we expect that the Fisher
information at the density matrix $D_0$ in the direction $A$ must be larger than the Fisher
information at $\alpha(D_0)$ in the direction $\alpha(A)$. This is the monotonicity property of the
Fisher information under coarse graining:

$$F_D(A) \ge F_{\alpha(D)}(\alpha(A)). \tag{6}$$
Although we do not want to have a concrete formula for the quantum Fisher information,
we require that this monotonicity condition must hold. Another requirement is
that $F_D(A)$ should be quadratic in $A$, in other words there exists a nondegenerate real
bilinear form $\gamma_D(A, B)$ on the selfadjoint matrices such that

$$F_D(A) = \gamma_D(A, A). \tag{7}$$

The requirements (6) and (7) are strong enough to obtain a reasonable but still wide
class of possible quantum Fisher informations.

We may assume that

$$\gamma_D(A, B) = \operatorname{Tr} A\, \mathbb{J}_D^{-1}(B^*) \tag{8}$$

for an operator $\mathbb{J}_D$ acting on matrices. (This formula expresses the inner product
$\gamma_D$ by means of the Hilbert-Schmidt inner product and the positive linear operator
$\mathbb{J}_D$.) In terms of the operator $\mathbb{J}_D$ the monotonicity condition reads as

$$\alpha^* \mathbb{J}_{\alpha(D)}^{-1} \alpha \le \mathbb{J}_D^{-1} \tag{9}$$

for every coarse graining $\alpha$. ($\alpha^*$ stands for the adjoint of $\alpha$ with respect to the
Hilbert-Schmidt product. Recall that $\alpha$ is completely positive and trace preserving if
and only if $\alpha^*$ is completely positive and unital.) On the other hand the latter
condition is equivalent to

$$\alpha\, \mathbb{J}_D\, \alpha^* \le \mathbb{J}_{\alpha(D)}. \tag{10}$$
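
The equivalence of (9) and (10) is a standard operator-inequality argument. The following is a sketch, assuming that $\mathbb{J}_D$ and $\mathbb{J}_{\alpha(D)}$ are positive and invertible with respect to the Hilbert-Schmidt inner product; the symbol $T$ is an auxiliary operator introduced here only for the computation.

```latex
\begin{align*}
T &:= \mathbb{J}_{\alpha(D)}^{-1/2}\, \alpha\, \mathbb{J}_D^{1/2}, \\
\alpha^* \mathbb{J}_{\alpha(D)}^{-1} \alpha \le \mathbb{J}_D^{-1}
  &\iff \mathbb{J}_D^{1/2}\, \alpha^* \mathbb{J}_{\alpha(D)}^{-1} \alpha\, \mathbb{J}_D^{1/2} \le I
   \iff T^* T \le I, \\
\alpha\, \mathbb{J}_D\, \alpha^* \le \mathbb{J}_{\alpha(D)}
  &\iff \mathbb{J}_{\alpha(D)}^{-1/2}\, \alpha\, \mathbb{J}_D\, \alpha^*\, \mathbb{J}_{\alpha(D)}^{-1/2} \le I
   \iff T T^* \le I, \\
T^* T \le I &\iff \|T\| \le 1 \iff T T^* \le I .
\end{align*}
```

The last line uses only that $T$ and $T^*$ have the same operator norm.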



We proved the following theorem in [20], see also [?].



Theorem 2.1 If for every density matrix $D$ a positive definite bilinear form $\gamma_D$ is
given such that (6) holds for all completely positive coarse grainings $\alpha$ and
$\gamma_D(A, A)$ is continuous in $D$ for every fixed $A$, then there exists a unique operator
monotone function $f : \mathbb{R}^+ \to \mathbb{R}$ such that $f(t) = t f(t^{-1})$ and $\gamma_D(A, A)$ is
given by the following prescription.

$$\gamma_D(A, A) = \operatorname{Tr} A\, \mathbb{J}_D^{-1}(A) \quad\text{and}\quad \mathbb{J}_D = \mathbb{R}_D^{1/2} f(\mathbb{L}_D \mathbb{R}_D^{-1}) \mathbb{R}_D^{1/2},$$

where the linear transformations $\mathbb{L}_D$ and $\mathbb{R}_D$ acting on matrices are the left and
right multiplications, that is

$$\mathbb{L}_D(X) = DX \quad\text{and}\quad \mathbb{R}_D(X) = XD.$$
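
It may help to see this prescription in coordinates. The sketch below assumes that $D$ is invertible and diagonal in the chosen basis, $D = \mathrm{Diag}(\lambda_1, \dots, \lambda_n)$ with $\lambda_i > 0$; the matrix units $E_{ij}$ and the entries $A_{ij}$ of a selfadjoint $A$ are notation introduced here for illustration only.

```latex
\begin{align*}
\mathbb{L}_D E_{ij} &= \lambda_i E_{ij}, \qquad \mathbb{R}_D E_{ij} = \lambda_j E_{ij}, \\
\mathbb{J}_D E_{ij} &= \lambda_j\, f(\lambda_i / \lambda_j)\, E_{ij}, \\
\gamma_D(A, A) &= \operatorname{Tr} A\, \mathbb{J}_D^{-1}(A)
   = \sum_{i,j=1}^{n} \frac{|A_{ij}|^2}{\lambda_j\, f(\lambda_i / \lambda_j)} .
\end{align*}
```

The symmetry condition $f(t) = t f(t^{-1})$ gives $\lambda_j f(\lambda_i/\lambda_j) = \lambda_i f(\lambda_j/\lambda_i)$, so the weight is symmetric in $i$ and $j$.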



Although the statement of the theorem seems to be rather complicated, the formula
for $F_D(A) = \gamma_D(A, A)$ becomes simpler when $D$ and $A$ commute. On the subspace
$\{A : AD = DA\}$ the left multiplication $\mathbb{L}_D$ coincides with the right one $\mathbb{R}_D$ and
$f(\mathbb{L}_D \mathbb{R}_D^{-1}) = f(1)$. Therefore we have

$$F_D(A) = \frac{1}{f(1)} \operatorname{Tr} D^{-1} A^2 \quad\text{if } AD = DA. \tag{11}$$



Under the hypothesis of commutation the quantum Fisher information is unique up to
a constant factor. (This fact reminds us of the Cencov uniqueness theorem in
Kolmogorovian probability [?]. According to this theorem the metric on finite probability
spaces is unique when monotonicity under Markovian kernels is imposed.) We say that
the quantum Fisher information is classically Fisher-adjusted if

$$F_D(A) = \operatorname{Tr} D^{-1} A^2 \quad\text{when } AD = DA. \tag{12}$$

This means that we impose the normalization $f(1) = 1$ on the operator monotone
function. In the sequel we always assume this condition.

Via the operator $\mathbb{J}_D$, each monotone Fisher information determines a quantity

$$\varphi_D[A, A] := \operatorname{Tr} A\, \mathbb{J}_D(A) \tag{13}$$

which could be called generalized variance. According to (10) this possesses the
monotonicity property

$$\varphi_D[\alpha^*(A), \alpha^*(A)] \le \varphi_{\alpha(D)}[A, A]. \tag{14}$$

Since (9) and (10) are equivalent we observe a one-to-one correspondence between
monotone Fisher informations and monotone generalized variances. Any such
variance has the property $\varphi_D[A, A] = \operatorname{Tr} D A^2$ for commuting $D$ and $A$. The
examples below show that this does not hold in general.



The analysis in [?] led to the fact that among all monotone quantum Fisher
informations there is a smallest one which corresponds to the function
$f_m(t) = (1 + t)/2$. In this case

$$F_D^{\min}(A) = \operatorname{Tr} A L = \operatorname{Tr} D L^2, \quad\text{where } DL + LD = 2A. \tag{15}$$

For the purpose of a quantum Cramer-Rao inequality the minimal quantity seems to
be the best, since the inverse gives the largest lower bound. In fact, the matrix $L$ has
been used for a long time under the name of symmetric logarithmic derivative,
see [?] and [?]. In this example the generalized covariance is

$$\varphi_D[A, B] = \tfrac{1}{2} \operatorname{Tr} D(AB + BA) \tag{16}$$

and we have

$$\mathbb{J}_D(A) = \tfrac{1}{2}(DA + AD) \quad\text{and}\quad \mathbb{J}_D^{-1}(A) = L = 2 \int_0^\infty e^{-tD} A e^{-tD}\, dt \tag{17}$$

for the superoperator $\mathbb{J}$ of the previous section.
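
For a concrete handle on (15) and (17), one can diagonalize $D$. The following sketch assumes $D = \mathrm{Diag}(\lambda_1, \dots, \lambda_n)$ with all $\lambda_i > 0$ and uses the matrix entries $A_{ij}$, $L_{ij}$ in that basis; these coordinates are introduced here only for illustration.

```latex
\begin{align*}
DL + LD = 2A
  &\iff (\lambda_i + \lambda_j)\, L_{ij} = 2 A_{ij}
   \iff L_{ij} = \frac{2 A_{ij}}{\lambda_i + \lambda_j}, \\
\operatorname{Tr} A L &= \sum_{i,j} A_{ji} L_{ij}
   = \sum_{i,j} \frac{2\, |A_{ij}|^2}{\lambda_i + \lambda_j}, \\
\operatorname{Tr} D L^2 &= \sum_{i,j} \lambda_i\, |L_{ij}|^2
   = \frac{1}{2} \sum_{i,j} (\lambda_i + \lambda_j)\, |L_{ij}|^2
   = \sum_{i,j} \frac{2\, |A_{ij}|^2}{\lambda_i + \lambda_j}, \\
2 \int_0^\infty e^{-t(\lambda_i + \lambda_j)} A_{ij}\, dt
   &= \frac{2 A_{ij}}{\lambda_i + \lambda_j} = L_{ij} .
\end{align*}
```

The first three lines confirm that the two expressions for the minimal Fisher information agree, and the last line checks the integral formula entrywise.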

The set of invertible $n \times n$ density matrices is a manifold of dimension $n^2 - 1$.
Indeed, parametrizing these matrices by $n - 1$ real diagonal entries and $(n - 1)n/2$
upper diagonal complex entries we have $n^2 - 1$ real parameters which run over an open
subset of the Euclidean space $\mathbb{R}^{n^2 - 1}$. Since operator monotone functions are smooth
(even analytic), all the quantities in Theorem 2.1 endow the manifold of density
matrices with a Riemannian structure.



3 Garden of monotone metrics 



All the monotone quantum Fisher information quantities in the range of the previous
theorem depend smoothly on the footpoint density $D$ and hence they endow the
state space with a Riemannian structure. In particular, the Riemannian geometry
of the minimal Fisher information was the subject of the paper [?].

It is instructive to consider the state space of a 2-level quantum system in detail.
Dealing with $2 \times 2$ density matrices, we conveniently use the so-called Stokes
parametrization

$$D_x = \tfrac{1}{2}(I + x_1 \sigma_1 + x_2 \sigma_2 + x_3 \sigma_3) \equiv \tfrac{1}{2}(I + x \cdot \sigma), \tag{18}$$

where $\sigma_1, \sigma_2, \sigma_3$ are the Pauli matrices and $(x_1, x_2, x_3) \in \mathbb{R}^3$ with
$x_1^2 + x_2^2 + x_3^2 \le 1$. A monotone Fisher information on $\mathcal{M}_2$ is rotation invariant
in the sense that it depends only on $r = \sqrt{x_1^2 + x_2^2 + x_3^2}$ and splits into radial and
tangential components as follows.

$$ds^2 = \frac{1}{1 - r^2}\, dr^2 + \frac{1}{1 + r}\, g\!\left(\frac{1 - r}{1 + r}\right) dn^2, \quad\text{where } g(t) = \frac{1}{f(t)}. \tag{19}$$

The radial component is independent of the function $f$. (This fact is again a reminder
of the Cencov uniqueness theorem.) The limit of the tangential component in (19) exists
when $r \to 1$ provided that $f(0) \ne 0$. In this way the standard Fubini-Study metric
is obtained on the set of pure states, up to a constant factor. (In case of larger density
matrices, pure states form a small part of the topological boundary of the invertible
density matrices. Hence, in order to speak about the extension of a Riemannian metric
on invertible densities to pure states, a rigorous meaning of the extension should be
given. This is the subject of the paper [?], see also [?].) Besides minimality the radial
extension yields another characterization of the minimal quantum Fisher information,
see [?].
Theorem 3.1 Among the monotone quantum Fisher informations the minimal one
(given by (15)) is characterized by the properties that it is classically Fisher-adjusted
(in the sense of (12)) and its radial limit is the Fubini-Study metric on pure states.

We note that in the minimal case $f_m(t) = (t + 1)/2$ we have constant tangential
component in (19):

$$ds^2 = \frac{1}{1 - r^2}\, dr^2 + dn^2. \tag{20}$$
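
The constancy of the tangential component can be checked directly. The lines below are a short verification, assuming the tangential coefficient in (19) is $\frac{1}{1+r}\, g\big(\frac{1-r}{1+r}\big)$ with $g = 1/f$.

```latex
\begin{align*}
g(t) &= \frac{1}{f_m(t)} = \frac{2}{1 + t}, \\
g\!\left(\frac{1 - r}{1 + r}\right)
  &= \frac{2}{1 + \frac{1 - r}{1 + r}}
   = \frac{2 (1 + r)}{(1 + r) + (1 - r)} = 1 + r, \\
\frac{1}{1 + r}\, g\!\left(\frac{1 - r}{1 + r}\right) &= 1 .
\end{align*}
```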



The metric (15) is widely accepted in the role of quantum Fisher information, see [?].

However, some other operator monotone functions may have importance. Let us see
first the other extreme. According to [?] there is a largest metric among all monotone
quantum Fisher informations and this corresponds to the function $f_M(t) = 2t/(1 + t)$.
In this case

$$\mathbb{J}_D^{-1}(A) = \tfrac{1}{2}(D^{-1} A + A D^{-1}) \quad\text{and}\quad F_D^{\max}(A) = \operatorname{Tr} D^{-1} A^2. \tag{21}$$

The maximal metric cannot be extended to pure states.
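
One way to see the obstruction in the qubit case is to insert $f_M$ into the tangential coefficient of (19), here taken to be $\frac{1}{1+r}\, g\big(\frac{1-r}{1+r}\big)$ with $g = 1/f$. This is only a sketch for $2 \times 2$ density matrices, not an argument for general $n$.

```latex
\begin{align*}
g(t) &= \frac{1}{f_M(t)} = \frac{1 + t}{2t}, \\
g\!\left(\frac{1 - r}{1 + r}\right)
  &= \frac{1 + \frac{1 - r}{1 + r}}{2\, \frac{1 - r}{1 + r}}
   = \frac{2}{2 (1 - r)} = \frac{1}{1 - r}, \\
\frac{1}{1 + r}\, g\!\left(\frac{1 - r}{1 + r}\right)
  &= \frac{1}{1 - r^2} \;\longrightarrow\; \infty \quad (r \to 1).
\end{align*}
```

This matches $f_M(0) = 0$, which is exactly the case excluded by the condition $f(0) \ne 0$.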
It can be proved that the function

$$f_\beta(x) = \beta(1 - \beta)\, \frac{(x - 1)^2}{(x^\beta - 1)(x^{1 - \beta} - 1)} \tag{22}$$

is operator monotone. This was done for the case $0 < \beta < 1$ in [?] and the case
$-1 < \beta < 0$ was treated in [?]. (The operator monotonicity follows also from (33) below.)
We denote by $F^\beta$ the corresponding Fisher information metric. When $A = i[D, B]$ is
orthogonal to the commutant of the footpoint $D$ in the tangent space, we have

$$F_D^\beta(A) = -\frac{1}{\beta(1 - \beta)} \operatorname{Tr}\big([D^\beta, B][D^{1 - \beta}, B]\big). \tag{23}$$

Apart from a constant factor this expression is the skew information proposed by
Wigner and Yanase some time ago ([?]). In the limiting cases $\beta \to 0$ or $1$ we have

$$f_0(x) = \frac{x - 1}{\log x}$$

and the corresponding metric

$$K_D(A, B) := \int_0^\infty \operatorname{Tr} A (D + t)^{-1} B (D + t)^{-1}\, dt \tag{24}$$

is named after Kubo, Mori, Bogoliubov etc. The Kubo-Mori inner product plays a role
in quantum statistical mechanics (see [?], for example). In this case

$$\mathbb{J}_D^{-1}(B) = \int_0^\infty (D + t)^{-1} B (D + t)^{-1}\, dt \quad\text{and}\quad \mathbb{J}_D(A) = \int_0^1 D^t A D^{1 - t}\, dt. \tag{25}$$



Therefore the corresponding generalized variance is

$$\varphi_D[A, B] = \int_0^1 \operatorname{Tr} A D^t B D^{1 - t}\, dt. \tag{26}$$
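
The two integrals in (25) can be evaluated entrywise when $D$ is diagonal. The sketch below assumes $D = \mathrm{Diag}(\lambda_1, \dots, \lambda_n)$ with $\lambda_i > 0$ and $\lambda_i \ne \lambda_j$ (the case $\lambda_i = \lambda_j$ follows by continuity); the entry notation is introduced here for illustration, and $f_0(x) = (x - 1)/\log x$ is the limiting function named in the text.

```latex
\begin{align*}
\big(\mathbb{J}_D(A)\big)_{ij}
  &= A_{ij} \int_0^1 \lambda_i^{t}\, \lambda_j^{1 - t}\, dt
   = A_{ij}\, \frac{\lambda_i - \lambda_j}{\log \lambda_i - \log \lambda_j}, \\
\big(\mathbb{J}_D^{-1}(B)\big)_{ij}
  &= B_{ij} \int_0^\infty \frac{dt}{(\lambda_i + t)(\lambda_j + t)}
   = B_{ij}\, \frac{\log \lambda_i - \log \lambda_j}{\lambda_i - \lambda_j}, \\
\lambda_j\, f_0(\lambda_i / \lambda_j)
  &= \lambda_j\, \frac{\lambda_i / \lambda_j - 1}{\log(\lambda_i / \lambda_j)}
   = \frac{\lambda_i - \lambda_j}{\log \lambda_i - \log \lambda_j} .
\end{align*}
```

The two weights are reciprocal, as they must be, and the first is the logarithmic mean of $\lambda_i$ and $\lambda_j$, in agreement with the general form $\mathbb{J}_D = \mathbb{R}_D^{1/2} f(\mathbb{L}_D \mathbb{R}_D^{-1}) \mathbb{R}_D^{1/2}$ for $f = f_0$.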

Beyond the affine parametrization of the set of density matrices, the exponential
parametrization is another possibility: any invertible density matrix is written in a
unique way in the form $e^H / \operatorname{Tr} e^H$, where $H$ is a selfadjoint traceless matrix. In the
affine parametrization the integral (24) gives the metric and (26) is the corresponding
variance. If we change to the exponential parametrization, the roles of the two formulas
are interchanged: the integral (24) gives the variance and (26) is the metric. (The reason
for this fact is that the change of coordinates is described by $\mathbb{J}$ from (25).) The affine
and exponential parametrizations are the subject of the paper [?] and the characterization
of the Kubo-Mori metric in [?] is probably another form of the duality observed between
(24) and (26).



4 The Cramer-Rao inequalities revisited 



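Let $\mathcal{M} := \{D_\theta : \theta \in G\}$ be a smooth $m$-dimensional manifold, parametrized in such
a way that $0 \in G \subset \mathbb{R}^m$. A (locally) unbiased estimator of $\theta$ at $\theta = 0$ is a
collection $A = (A_1, \dots, A_m)$ of selfadjoint matrices, such that

(i) $\operatorname{Tr} D_0 A_i = 0$ for all $1 \le i \le m$,

(ii) $\frac{\partial}{\partial\theta_i} \operatorname{Tr} D_\theta A_j \big|_{\theta=0} = \delta_{ij}$ for all $i, j = 1, \dots, m$.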

Suppose a generalized variance $\varphi_0$ is given. Then the generalized covariance matrix
of the estimator $A$ is a positive definite matrix, defined by
$\varphi_0[A]_{ij} = \varphi_0[A_i, A_j]$. If

$$\frac{\partial}{\partial\theta_j} \operatorname{Tr} D_\theta B \Big|_{\theta=0} = \varphi_0[B, L_j] \qquad (B = B^*)$$

determines the logarithmic derivatives $L_j$, then

$$\varphi_0[A_i, L_j] = \delta_{ij} \qquad (i, j = 1, \dots, m).$$

This orthogonality relation implies a matrix inequality for the Gram matrices which is
an inequality of Cramer-Rao type.



Theorem 4.1 Let $A = (A_1, \dots, A_m)$ be a locally (at $\theta = 0$) unbiased estimator of
$\theta$, moreover let $L_j$ and $\varphi_0$ be as above. Then

$$\varphi_0[A] \ge \Big( \big[\varphi_0[L_i, L_j]\big]_{i,j=1}^{m} \Big)^{-1}$$

in the sense of the order on positive definite matrices.



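The proof is rather simple if we use the block matrix method. Let $X$ and $B$ be
$m \times m$ matrices with $n \times n$ entries and assume that all entries of $B$ are constant
multiples of the unit matrix. ($A_i$ and $L_j$ are $n \times n$ matrices.) If $\alpha$ is a completely
positive mapping on $n \times n$ matrices, then $\tilde\alpha := \mathrm{Diag}(\alpha, \dots, \alpha)$ is a positive
mapping on block matrices and $\tilde\alpha(BX) = B \tilde\alpha(X)$. This implies that
$\operatorname{Tr} X \tilde\alpha(X^*) B \ge 0$ when $B$ is positive. Therefore the $m \times m$ ordinary matrix $M$
which has $ij$ entry

$$\operatorname{Tr} \big( X \tilde\alpha(X^*) \big)_{ij}$$

is positive. In the sequel we restrict ourselves to $m = 2$ for the sake of simplicity and
apply the above fact (with $4 \times 4$ block matrices) to the case

$$X = \begin{bmatrix} A_1 & 0 & 0 & 0 \\ A_2 & 0 & 0 & 0 \\ L_1 & 0 & 0 & 0 \\ L_2 & 0 & 0 & 0 \end{bmatrix} \quad\text{and}\quad \alpha = \mathbb{J}_D \quad (D = D_0).$$

Then we have

$$M = \begin{bmatrix}
\operatorname{Tr} A_1 \mathbb{J}_D(A_1) & \operatorname{Tr} A_1 \mathbb{J}_D(A_2) & \operatorname{Tr} A_1 \mathbb{J}_D(L_1) & \operatorname{Tr} A_1 \mathbb{J}_D(L_2) \\
\operatorname{Tr} A_2 \mathbb{J}_D(A_1) & \operatorname{Tr} A_2 \mathbb{J}_D(A_2) & \operatorname{Tr} A_2 \mathbb{J}_D(L_1) & \operatorname{Tr} A_2 \mathbb{J}_D(L_2) \\
\operatorname{Tr} L_1 \mathbb{J}_D(A_1) & \operatorname{Tr} L_1 \mathbb{J}_D(A_2) & \operatorname{Tr} L_1 \mathbb{J}_D(L_1) & \operatorname{Tr} L_1 \mathbb{J}_D(L_2) \\
\operatorname{Tr} L_2 \mathbb{J}_D(A_1) & \operatorname{Tr} L_2 \mathbb{J}_D(A_2) & \operatorname{Tr} L_2 \mathbb{J}_D(L_1) & \operatorname{Tr} L_2 \mathbb{J}_D(L_2)
\end{bmatrix} \ge 0.$$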



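Now we rewrite the matrix $M$ in terms of the generalized variance $\varphi_0$ and apply the
orthogonality assumption. We get

$$M = \begin{bmatrix}
\varphi_0[A_1, A_1] & \varphi_0[A_1, A_2] & 1 & 0 \\
\varphi_0[A_2, A_1] & \varphi_0[A_2, A_2] & 0 & 1 \\
1 & 0 & \varphi_0[L_1, L_1] & \varphi_0[L_1, L_2] \\
0 & 1 & \varphi_0[L_2, L_1] & \varphi_0[L_2, L_2]
\end{bmatrix} \ge 0.$$

Since the positivity of a block matrix

$$M = \begin{bmatrix} M_1 & I \\ I & M_2 \end{bmatrix}
   = \begin{bmatrix} \big[\varphi_0[A_i, A_j]\big]_{ij} & I \\ I & \big[\varphi_0[L_i, L_j]\big]_{ij} \end{bmatrix} \ge 0$$

implies $M_1 \ge M_2^{-1}$, we have exactly the statement of our Cramer-Rao inequality.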
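
The final implication is a Schur complement argument. A minimal sketch follows, assuming $M_2$ is invertible (the orthogonality relation forces the $L_j$ to be linearly independent, so their Gram matrix is invertible) and using an arbitrary test vector $v \in \mathbb{C}^m$, a symbol introduced only for this computation.

```latex
\begin{align*}
w &:= -M_2^{-1} v, \\
0 \le
\begin{bmatrix} v \\ w \end{bmatrix}^{*}
\begin{bmatrix} M_1 & I \\ I & M_2 \end{bmatrix}
\begin{bmatrix} v \\ w \end{bmatrix}
  &= v^* M_1 v + v^* w + w^* v + w^* M_2 w \\
  &= v^* M_1 v - 2\, v^* M_2^{-1} v + v^* M_2^{-1} v \\
  &= v^* \big( M_1 - M_2^{-1} \big) v .
\end{align*}
```

Since $v$ is arbitrary, this gives $M_1 \ge M_2^{-1}$.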



5 Statistical distinguishability and uncertainty 

Assume that a manifold $\mathcal{M} := \{D_\theta : \theta \in G\}$ of density matrices is given together
with a statistically relevant Riemannian metric $\gamma$. We do not give a formal definition
of such a metric. What we have in mind is the property that given two points on the
manifold their geodesic distance is interpreted as the statistical distinguishability of
the two density matrices in some statistical procedure.

Let $D_0 \in \mathcal{M}$ be a point on our statistical manifold. The geodesic ball

$$B_\varepsilon(D_0) := \{D \in \mathcal{M} : d(D_0, D) < \varepsilon\}$$

contains all density matrices which can be distinguished by an effort smaller than
$\varepsilon$ from the fixed density $D_0$. The size of the inference region $B_\varepsilon(D_0)$ measures
the statistical uncertainty at the density $D_0$. Following Jeffreys' rule the size is the
volume measure determined by the statistical (or information) metric. More precisely,
it is better to consider the asymptotics of the volume of $B_\varepsilon(D_0)$ as $\varepsilon \to 0$.
According to differential geometry

$$\mathrm{Vol}(B_\varepsilon(D_0)) = C_n \varepsilon^n - \frac{C_n}{6(n + 2)}\, \mathrm{Scal}(D_0)\, \varepsilon^{n + 2} + o(\varepsilon^{n + 2}), \tag{27}$$

where $n$ is the dimension of our manifold, $C_n$ is a constant (equal to the volume of
the unit ball in the Euclidean $n$-space) and Scal means the scalar curvature, see 3.98
Theorem in [?]. In this way, the scalar curvature of a statistically relevant Riemannian
metric might be interpreted as the average statistical uncertainty of the density
matrix (in the given statistical manifold). This interpretation becomes particularly
interesting for the full state space endowed with the Kubo-Mori inner product as a
statistically relevant Riemannian metric.
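
As a sanity check of the sign and the constant in the volume expansion (27), consider the unit 2-sphere with its round metric. This is an illustrative example only, not a statistical manifold from the text. Here $n = 2$, $C_2 = \pi$ and the scalar curvature equals $2$, while the area of a geodesic disc of radius $\varepsilon$ is known in closed form.

```latex
\begin{align*}
\mathrm{Vol}(B_\varepsilon) &= 2\pi (1 - \cos \varepsilon)
   = \pi \varepsilon^2 - \frac{\pi}{12}\, \varepsilon^4 + O(\varepsilon^6), \\
C_2\, \varepsilon^2 - \frac{C_2}{6 (2 + 2)} \cdot 2 \cdot \varepsilon^{4}
   &= \pi \varepsilon^2 - \frac{\pi}{12}\, \varepsilon^4 .
\end{align*}
```

The two expansions agree, so a larger scalar curvature corresponds to a smaller volume of small geodesic balls.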

Let $\mathcal{M}$ be the manifold of all invertible $n \times n$ density matrices. The Kubo-Mori (or
Bogoliubov) inner product is given by

$$\gamma_D(A, B) = \operatorname{Tr} (\partial_A D)(\partial_B \log D). \tag{28}$$

In particular, in the affine parametrization we have

$$\gamma_D(A, B) = \int_0^\infty \operatorname{Tr} A (D + t)^{-1} B (D + t)^{-1}\, dt, \tag{29}$$

see [19]. On the basis of numerical evidence it was conjectured in [?] that the scalar
curvature, which is a statistical uncertainty, is monotone in the following sense. For
any coarse graining $\alpha$ the scalar curvature at a density $D$ is smaller than at $\alpha(D)$.
The average statistical uncertainty is increasing under coarse graining. Up to now this
conjecture has not been proven mathematically. Another form of the conjecture is the
statement that along a curve of Gibbs states

$$\frac{e^{-\beta H}}{\operatorname{Tr} e^{-\beta H}}$$

the scalar curvature changes monotonically with the inverse temperature $\beta > 0$, that is,
the scalar curvature is a monotone decreasing function of $\beta$.



6 Relative entropy as contrast function 

Let $D_\theta$ be a smooth manifold of density matrices. The following construction is
motivated by classical statistics. Suppose that a nonnegative functional $d(D_1, D_2)$ of
two variables is given on the density matrices. In many cases one can get a Riemannian
metric by differentiating $d$ once in each variable and restricting to the diagonal
$D_1 = D_2$. To be more precise the nonnegative smooth functional $d(\,\cdot\,,\,\cdot\,)$ is called a
contrast functional if $d(D_1, D_2) = 0$ implies $D_1 = D_2$. (For the role of contrast
functionals in classical estimation, see [?].) We note that a contrast functional is a
particular example of yokes, cf. [?].

Following the work of Csiszar in classical information theory, Petz introduced a
family of information quantities parametrized by a function $F : \mathbb{R}^+ \to \mathbb{R}$

$$S_F(D_1, D_2) = \operatorname{Tr} \big( D_1^{1/2} F(\Delta_{D_2, D_1}) D_1^{1/2} \big), \tag{30}$$

see [18], or [?] p. 113. Here $\Delta_{D_2, D_1} := \mathbb{L}_{D_2} \mathbb{R}_{D_1}^{-1}$ is the relative modular
operator of the two densities. When $F$ is operator convex, this quasi-entropy possesses
good properties, for example it is a contrast functional in the above sense if $F$ is not
linear.
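In particular for

$$F_\alpha(t) = \frac{4}{1 - \alpha^2} \big( 1 - t^{(1 + \alpha)/2} \big)$$

we have

$$S_\alpha(D_1, D_2) = \frac{4}{1 - \alpha^2} \Big( 1 - \operatorname{Tr} D_2^{(1 + \alpha)/2} D_1^{(1 - \alpha)/2} \Big). \tag{31}$$

By differentiating we get

$$K_D^\alpha(A, B) = -\frac{\partial^2}{\partial t\, \partial u} S_\alpha(D + tA, D + uB) \Big|_{t = u = 0}, \tag{32}$$

which is related to (23) as

$$F_D^\beta(A) = K_D^\alpha(A, A) \quad\text{and}\quad \beta = (1 - \alpha)/2.$$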

Ruskai and Lesniewski discovered that all monotone Fisher informations are obtained
from a quasi-entropy as contrast functional [16]. The relation of the function $F$ in
(30) to the function $f$ in Theorem 2.1 is

$$\frac{1}{f(t)} = \frac{F(t) + t F(t^{-1})}{(t - 1)^2}. \tag{33}$$
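
As a consistency check of (33), one can insert the family $F_\alpha(t) = \frac{4}{1 - \alpha^2}\big(1 - t^{(1 + \alpha)/2}\big)$ with $\beta = (1 - \alpha)/2$, so that $\frac{4}{1 - \alpha^2} = \frac{1}{\beta(1 - \beta)}$ and $(1 + \alpha)/2 = 1 - \beta$. The computation below is a sketch under the assumption that this is the family intended in the text; it produces a function $f_\beta$ normalized by $f_\beta(1) = 1$.

```latex
\begin{align*}
F_\alpha(t) + t\, F_\alpha(t^{-1})
  &= \frac{1}{\beta(1 - \beta)} \Big( 1 - t^{1 - \beta} + t - t \cdot t^{-(1 - \beta)} \Big) \\
  &= \frac{1}{\beta(1 - \beta)} \big( 1 - t^{\beta} \big) \big( 1 - t^{1 - \beta} \big), \\
\frac{1}{f_\beta(t)}
  &= \frac{(t^{\beta} - 1)(t^{1 - \beta} - 1)}{\beta(1 - \beta)\, (t - 1)^2}
  \quad\Longleftrightarrow\quad
  f_\beta(t) = \beta(1 - \beta)\, \frac{(t - 1)^2}{(t^{\beta} - 1)(t^{1 - \beta} - 1)} .
\end{align*}
```

Letting $\beta \to 0$ and using $(t^\beta - 1)/\beta \to \log t$ gives $f_0(t) = (t - 1)/\log t$, the function of the Kubo-Mori metric.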